Model-Driven Auto-Tuning of Stencil Computations on GPUs

نویسندگان

  • Yue Hu
  • David M. Koppelman
  • Steven R. Brandt
  • Frank Löffler
چکیده

Stencil computations are a class of algorithms which perform nearest-neighbor computation, often on a multi-dimensional grid. This type of calculation forms the basis for computer simulations across almost every field of science. The increasing computational speed of graphics processing units (GPUs) make their use for stencil computations an interesting goal. However, achieving highly efficient implementations is often nontrivial, as numerous publications attest. In this work, we propose an analytic performance model for stencil codes on GPUs, which both delivers close-to optimal performance, but at the same time does not require extensive tuning at compile or run time. We evaluate the effectiveness of our performance model using different stencil benchmarks and with various stencil radii.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Auto-tuning Jit Compiler for Accelerating Multiple Stencil Computations

We present a JIT compiler with auto-tuning capabilities fusing multiple stencil computations. Data arrays for scientific computing of image processing often exceed cache-memory size. To take advantage of spatial and temporal locality, a common method is to partition the images into tiling blocks for multicore architectures. In realistic scenarios, the multiple image algorithms, most of which ar...

متن کامل

A Generalized Framework for Auto-tuning Stencil Computations

This work introduces a generalized framework for automatically tuning stencil computations to achieve superior performance on a broad range of multicore architectures. Stencil (nearest-neighbor) based kernels constitute the core of many important scientific applications involving block-structured grids. Auto-tuning systems search over optimization strategies to find the combination of tunable p...

متن کامل

PATUS: A Code Generation and Auto-Tuning Framework For Parallel Stencil Computations

PATUS is a code generation and auto-tuning framework for stencil computations targeted at modern multiand many-core processors, such as multicore CPUs and graphics processing units. Its ultimate goals are to provide a means towards productivity and performance on current and future multiand many-core platforms. The framework generates the code for a compute kernel from a specification of the st...

متن کامل

Auto-tuning the 27-point Stencil for Multicore

This study focuses on the key numerical technique of stencil computations, used in many different scientific disciplines, and illustrates how auto-tuning can be used to produce very efficient implementations across a diverse set of current multicore architectures.

متن کامل

Yet another Hybrid Strategy for Auto-tuning SpMV on GPUs

Sparse matrix-vector multiplication (SpMV) is a key linear algebra algorithm and is widely used in many application domains. Besides multi-core architecture, there is also extensive research focusing on accelerating SpMV on many-core Graphics Processing Units (GPUs). SpMV computations have many indirect and irregular memory accesses, and load imbalance could occur while mapping computations ont...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015